Syntactic annotation of spontaneous speech: application to call-center conversation data

Authors

  • Thierry Bazillon
  • Melanie Deplano
  • Frédéric Béchet
  • Alexis Nasr
  • Benoît Favre
Abstract

Both frameworks are based on the automatic semantic analysis of human-human spoken conversations. The semantic interpretation of a spoken utterance can be split into a two-level process: a tagging process that projects lexical items onto basic conceptual constituents, and a composition process that takes these basic constituents as input and combines them into a possibly complex semantic interpretation of the utterance, represented, for example, as a set of semantic Frames. Various methods, reviewed in [3], have been proposed for both levels of this process, from statistical tagging approaches to parsing methods. Syntactic information is useful for such an understanding process: at the concept level, syntax can help reduce ambiguity through semantic role labelling; at the semantic Frame level, syntactic dependencies can be projected onto semantic dependencies to obtain structured semantic objects. Despite its usefulness, syntactic parsing is not always considered when building a Spoken Language Understanding (SLU) system dedicated to processing spontaneous speech, because of two main issues. First, transcriptions obtained through Automatic Speech Recognition (ASR) contain errors, and the number of errors increases with the level of spontaneity of the speech. Second, spontaneous speech transcriptions are often difficult to parse with a grammar developed for written text, owing to the specificities of spontaneous speech syntax (agrammaticality and disfluencies such as repairs, false starts or repetitions). The first issue is currently tackled in the DECODA project with methods that handle ambiguous inputs, such as word lattices produced by an ASR system. The second issue is the target of this paper.
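
The two-level interpretation process described in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the lexicon, the frame rules, the concept labels and the function names below are all hypothetical, and a simple dictionary lookup stands in for the statistical taggers and parsers the abstract mentions.

```python
from dataclasses import dataclass, field

# Hypothetical toy lexicon: lexical item -> basic concept (assumption, not from the paper).
CONCEPT_LEXICON = {
    "lost": "LOSS_EVENT",
    "bag": "OBJECT",
    "bus": "TRANSPORT",
}

# Hypothetical frame rules: trigger concept -> (frame name, concept -> role name).
FRAME_RULES = {
    "LOSS_EVENT": ("Losing", {"OBJECT": "Possession", "TRANSPORT": "Place"}),
}


@dataclass
class Frame:
    """A semantic frame with a name and its filled roles."""
    name: str
    roles: dict = field(default_factory=dict)


def tag_concepts(tokens):
    """Level 1: project lexical items onto basic conceptual constituents."""
    return [(tok, CONCEPT_LEXICON[tok]) for tok in tokens if tok in CONCEPT_LEXICON]


def compose_frames(constituents):
    """Level 2: combine the tagged constituents into semantic frames."""
    frames = []
    for _, concept in constituents:
        if concept in FRAME_RULES:
            name, role_map = FRAME_RULES[concept]
            # Fill each role with the token whose concept maps to that role.
            roles = {role_map[c]: tok for tok, c in constituents if c in role_map}
            frames.append(Frame(name, roles))
    return frames


utterance = "i lost my bag on the bus".split()
constituents = tag_concepts(utterance)
frames = compose_frames(constituents)
print(constituents)
print(frames)
```

In this toy version the composition step attaches every available constituent to the triggered frame. A syntax-aware system of the kind the abstract argues for would instead use syntactic dependencies to decide which constituents fill which roles.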

Similar resources

Adapting dependency parsing to spontaneous speech for open domain spoken language understanding

Parsing human-human conversations consists in automatically enriching text transcription with semantic structure information. We use in this paper a FrameNet-based approach to semantics that, without needing a full semantic parse of a message, goes further than a simple flat translation of a message into basic concepts. FrameNet-based semantic parsing may follow a syntactic parsing step, howeve...

DECODA: a call-centre human-human spoken conversation corpus

The goal of the DECODA project is to reduce the development cost of Speech Analytics systems by reducing the need for manual annotation. This project aims to propose robust speech data mining tools in the framework of call-center monitoring and evaluation, by means of weakly supervised methods. The applicative framework of the project is the call-center of the RATP (Paris public transport autho...

A Comparison between Three Methods of Language Sampling: Freeplay, Narrative Speech and Conversation

Objectives: The spontaneous language sample analysis is an important part of the language assessment protocol. Language samples give us useful information about how children use language in the natural situations of daily life. The purpose of this study was to compare Conversation, Freeplay, and narrative speech in aspects of Mean Length of Utterance (MLU), Type-token ratio (TTR), and the numbe...

Data-Driven Language Understanding for Spoken Language Dialogue

We present a natural-language customer service application for a telephone banking call center, developed as part of the AMITIES dialogue project (Automated Multilingual Interaction with Information and Services). Our dialogue system, based on empirical data gathered from real call-center conversations, features data-driven techniques that allow for spoken language understanding despite speech ...

Prosody in a corpus of French spontaneous speech: perception, annotation and prosody ~ syntax interaction

Our study focuses on the issue of prosodic annotation and of the prosody ~ syntax interface in conversation and is based on a large corpus of conversational speech in French. The results of inter-transcriber agreement tests show that two expert transcribers are consistent in their labeling of prosodic phrasing and that this consistency is well above chance. A qualitative analysis reveals transcri...

Publication date: 2012